depth completion
- North America > United States > California > Los Angeles County > Los Angeles (0.29)
- North America > United States > California > San Diego County > Vista (0.04)
- North America > Canada (0.04)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Sensing and Signal Processing > Image Processing (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
- Asia > China > Guangxi Province > Nanning (0.04)
Geometry-Aware Sparse Depth Sampling for High-Fidelity RGB-D Depth Completion in Robotic Systems
Salloom, Tony, Zhou, Dandi, Sun, Xinhai
Accurate three-dimensional perception is essential for modern industrial robotic systems that perform manipulation, inspection, and navigation tasks. RGB-D and stereo vision sensors are widely used for this purpose, but the depth maps they produce are often noisy, incomplete, or biased due to sensor limitations and environmental conditions. Depth completion methods aim to generate dense, reliable depth maps from RGB images and sparse depth input. However, a key limitation in current depth completion pipelines is the unrealistic generation of sparse depth: sparse pixels are typically selected uniformly at random from dense ground-truth depth, ignoring the fact that real sensors exhibit geometry-dependent and spatially nonuniform reliability. In this work, we propose a normal-guided sparse depth sampling strategy that leverages PCA-based surface normal estimation on the RGB-D point cloud to compute a per-pixel depth reliability measure. The sparse depth samples are then drawn according to this reliability distribution. We integrate this sampling method with the Marigold-DC diffusion-based depth completion model and evaluate it on NYU Depth v2 using the standard metrics. Experiments show that our geometry-aware sparse depth improves accuracy, reduces artifacts near edges and discontinuities, and produces more realistic training conditions that better reflect real sensor behavior.
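To make the sampling recipe above concrete, here is a minimal sketch. The abstract does not give the exact reliability formula, so the cosine-of-incidence reliability below is an assumption, and the function names (`backproject`, `pca_normals`, `sample_sparse_depth`) are illustrative rather than from the paper:

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Back-project a dense depth map (H, W) into a per-pixel 3-D point map (H, W, 3)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    return np.stack([(u - cx) * depth / fx, (v - cy) * depth / fy, depth], axis=-1)

def pca_normals(points, k=3):
    """Estimate per-pixel surface normals as the least-variance PCA direction
    of each (2k+1)x(2k+1) neighborhood (brute force, for clarity only)."""
    h, w, _ = points.shape
    normals = np.zeros_like(points)
    for i in range(k, h - k):
        for j in range(k, w - k):
            nb = points[i - k:i + k + 1, j - k:j + k + 1].reshape(-1, 3)
            nb = nb - nb.mean(axis=0)
            _, _, vt = np.linalg.svd(nb, full_matrices=False)
            normals[i, j] = vt[-1]          # direction of least variance
    return normals

def sample_sparse_depth(depth, normals, n_samples=500):
    """Draw sparse depth pixels with probability proportional to an assumed
    reliability: |cos| of the angle between the normal and the optical axis,
    so grazing surfaces (unreliable for real sensors) are sampled less often."""
    reliability = np.abs(normals[..., 2])
    reliability[depth <= 0] = 0.0           # never sample invalid depth
    p = reliability.ravel() / reliability.sum()
    # assumes at least n_samples pixels have non-zero reliability
    idx = np.random.choice(p.size, size=n_samples, replace=False, p=p)
    mask = np.zeros(depth.size, dtype=bool)
    mask[idx] = True
    return mask.reshape(depth.shape)
```

The per-pixel SVD loop is written for clarity; a practical implementation would vectorize the neighborhood PCA.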
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
S2ML: Spatio-Spectral Mutual Learning for Depth Completion
Zhao, Zihui, Zhang, Yifei, Wang, Zheng, Li, Yang, Jiang, Kui, Geng, Zihan, Lin, Chia-Wen
The raw depth images captured by RGB-D cameras using Time-of-Flight (ToF) or structured light often suffer from incomplete depth values due to weak reflections, boundary shadows, and artifacts, which limits their use in downstream vision tasks. Existing methods address this problem through depth completion in the image domain, but they overlook the physical characteristics of raw depth images: the presence of invalid depth areas alters the frequency distribution pattern. In this work, we propose a Spatio-Spectral Mutual Learning framework (S2ML) that harmonizes the advantages of the spatial and frequency domains for depth completion. Specifically, we consider the distinct properties of the amplitude and phase spectra and devise a dedicated spectral fusion module. Meanwhile, local and global correlations between spatial-domain and frequency-domain features are computed in a unified embedding space. This gradual mutual representation and refinement encourages the network to fully exploit complementary physical characteristics and priors for more accurate depth completion. Extensive experiments demonstrate the effectiveness of S2ML, which outperforms the state-of-the-art method CFormer by 0.828 dB and 0.834 dB on the NYU-Depth V2 and SUN RGB-D datasets, respectively.

Depth sensing is essential for various 3D tasks such as autonomous driving [1], robot navigation [2], [3], and scene reconstruction [4], [5]. However, raw depth images captured by current depth sensors, such as Time-of-Flight (ToF) and structured-light devices like the Microsoft Kinect [6] and Intel RealSense [7], often contain significant invalid areas. These regions arise from factors such as highly reflective or transparent surfaces and challenging lighting conditions.
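The spectral side of S2ML is easiest to see in code. The sketch below shows only the amplitude/phase decomposition the abstract mentions, with a toy 1x1-conv fusion standing in for the paper's dedicated spectral fusion module, whose actual layers are not specified:

```python
import torch

def split_spectrum(feat):
    """Decompose a spatial feature map (B, C, H, W) into amplitude and phase
    spectra, the two components S2ML treats separately."""
    spec = torch.fft.rfft2(feat, norm="ortho")
    return spec.abs(), spec.angle()

def merge_spectrum(amp, phase, size):
    """Rebuild a spatial feature map from (possibly fused) amplitude and phase."""
    return torch.fft.irfft2(torch.polar(amp, phase), s=size, norm="ortho")

class NaiveSpectralFusion(torch.nn.Module):
    """Toy stand-in for the paper's spectral fusion module: process amplitude
    and phase with independent 1x1 convolutions, then return to the spatial
    domain. Purely illustrative of the amplitude/phase split."""
    def __init__(self, channels):
        super().__init__()
        self.amp_conv = torch.nn.Conv2d(channels, channels, 1)
        self.pha_conv = torch.nn.Conv2d(channels, channels, 1)

    def forward(self, feat):
        amp, pha = split_spectrum(feat)
        return merge_spectrum(self.amp_conv(amp), self.pha_conv(pha), feat.shape[-2:])
```

Treating amplitude and phase with separate branches respects their distinct roles: amplitude carries energy distribution (where invalid regions distort the spectrum), while phase carries structure.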
- Asia > China > Guangdong Province > Shenzhen (0.45)
- Asia > China > Hubei Province > Wuhan (0.25)
- Asia > China > Heilongjiang Province > Harbin (0.24)
- (3 more...)
- Research Report > New Finding (0.46)
- Research Report > Promising Solution (0.34)
- Health & Medicine (0.90)
- Education > Educational Setting > Higher Education (0.34)
HDCNet: A Hybrid Depth Completion Network for Grasping Transparent and Reflective Objects
Xie, Guanghu, Li, Mingxu, Wu, Songwei, Liu, Yang, Xie, Zongwu, Cao, Baoshi, Liu, Hong
Depth perception of transparent and reflective objects has long been a critical challenge in robotic manipulation. Conventional depth sensors often fail to provide reliable measurements on such surfaces, limiting the performance of robots in perception and grasping tasks. To address this issue, we propose a novel depth completion network, HDCNet, which integrates the complementary strengths of Transformer, CNN, and Mamba architectures. Specifically, the encoder is designed as a dual-branch Transformer-CNN framework to extract modality-specific features. At the shallow layers of the encoder, we introduce a lightweight multimodal fusion module to effectively integrate low-level features. At the network bottleneck, a Transformer-Mamba hybrid fusion module is developed to achieve deep integration of high-level semantic and global contextual information, significantly enhancing depth completion accuracy and robustness. Extensive evaluations on multiple public datasets demonstrate that HDCNet achieves state-of-the-art (SOTA) performance in depth completion tasks. Furthermore, robotic grasping experiments show that HDCNet substantially improves grasp success rates for transparent and reflective objects, achieving up to a 60% increase.
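As a rough sketch of what a lightweight multimodal fusion module at the shallow encoder layers could look like (the abstract does not specify its layers, so this gated design is purely illustrative):

```python
import torch
import torch.nn as nn

class ShallowFusion(nn.Module):
    """Illustrative gated fusion of RGB-branch and depth-branch features:
    the concatenated streams produce a per-pixel gate deciding how much of
    the depth features to pass through before the two streams are merged."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1),
            nn.Sigmoid(),
        )
        self.merge = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, rgb_feat, depth_feat):
        g = self.gate(torch.cat([rgb_feat, depth_feat], dim=1))   # (B, C, H, W) in [0, 1]
        return self.merge(torch.cat([rgb_feat, g * depth_feat], dim=1))
```

A gate like this is one common way to suppress unreliable depth features on transparent or reflective surfaces while letting the RGB stream dominate there.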
- Asia > China > Heilongjiang Province > Harbin (0.04)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
- Asia > China > Guangxi Province > Nanning (0.04)
ETA: Energy-based Test-time Adaptation for Depth Completion
Chung, Younjoon, Park, Hyoungseob, Rim, Patrick, Zhang, Xiaoran, He, Jihe, Zeng, Ziyao, Cicek, Safa, Hong, Byung-Woo, Duncan, James S., Wong, Alex
We propose a method for test-time adaptation of pretrained depth completion models. Depth completion models trained on some "source" data often predict erroneous outputs when transferred to "target" data captured in novel environmental conditions, due to covariate shift. The crux of our method lies in quantifying the likelihood that depth predictions belong to the source data distribution. The challenge is the lack of access to out-of-distribution (target) data prior to deployment. Hence, rather than making assumptions about the target distribution, we use adversarial perturbations as a mechanism to explore the data space. This enables us to train an energy model that scores local regions of depth predictions as in- or out-of-distribution. At test time, we update the parameters of the pretrained depth completion model to minimize energy, effectively aligning test-time predictions with those of the source distribution. We call our method "Energy-based Test-time Adaptation", or ETA for short. We evaluate ETA across three indoor and three outdoor datasets, where it improves over the previous state-of-the-art method by an average of 6.94% outdoors and 10.23% indoors. Project page: https://fuzzythecat.github.io/eta.
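A minimal sketch of the test-time update loop, assuming a pretrained completion model `model(rgb, sparse)` and a frozen energy model `energy(pred)` returning a per-region score map (both interfaces are assumptions; the adversarial-perturbation training of the energy model happens before deployment and is not shown):

```python
import torch

@torch.enable_grad()
def eta_adapt(model, energy, rgb, sparse, steps=5, lr=1e-5):
    """Minimize the learned energy of the depth prediction so it looks
    in-distribution to the frozen energy model (low energy == source-like).
    The per-region score map from `energy` is reduced to a scalar mean."""
    energy.eval()                            # energy model stays frozen
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        pred = model(rgb, sparse)            # dense depth prediction
        e = energy(pred).mean()              # scalar energy to minimize
        e.backward()
        opt.step()                           # only `model` parameters are updated
    return model(rgb, sparse).detach()
```

The step count and learning rate here are placeholders; in practice they would be tuned so adaptation corrects the covariate shift without drifting away from the source-trained solution.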
- Europe > Netherlands > South Holland > Delft (0.04)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- Asia > Middle East > Israel (0.04)
- Asia > China > Guangxi Province > Nanning (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.29)
- North America > United States > California > San Diego County > Vista (0.04)
- North America > Canada (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Sensing and Signal Processing > Image Processing (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Super LiDAR Reflectance for Robotic Perception
Gao, Wei, Zhang, Jie, Zhao, Mingle, Zhang, Zhiyuan, Kong, Shu, Ghaffari, Maani, Song, Dezhen, Xu, Cheng-Zhong, Kong, Hui
Conventionally, human intuition defines vision as a modality of passive optical sensing, while active optical sensing is typically regarded as measurement rather than a default modality of vision. The situation is now changing: sensor technologies and data-driven paradigms empower active optical sensing to redefine the boundaries of vision, ushering in a new era of active vision. Light Detection and Ranging (LiDAR) sensors capture reflectance from object surfaces, which remains invariant under varying illumination conditions, showing significant potential for robotic perception tasks such as detection, recognition, segmentation, and Simultaneous Localization and Mapping (SLAM). These applications often rely on dense sensing capabilities, typically achieved by high-resolution, expensive LiDAR sensors. A key challenge with low-cost LiDARs lies in the sparsity of scan data, which limits their broader application. To address this limitation, this work introduces a framework for generating dense LiDAR reflectance images from sparse data, leveraging the unique attributes of non-repeating scanning LiDAR (NRS-LiDAR). We tackle critical challenges, including reflectance calibration and the transition from static to dynamic scene domains, facilitating the reconstruction of dense reflectance images in real-world settings. The key contributions of this work include a comprehensive dataset for LiDAR reflectance image densification, a densification network tailored to NRS-LiDAR, and diverse applications such as loop closure and traffic-lane detection using the generated dense reflectance images. Experimental results validate the efficacy of the proposed approach, which integrates computer vision techniques with LiDAR data processing, enhancing the applicability of low-cost LiDAR systems and establishing a novel paradigm for robotic active vision: LiDAR as a Camera. The dataset and code are available at: To Be Updated.
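To make the input side concrete, here is a hypothetical helper that accumulates calibrated NRS-LiDAR returns into a sparse reflectance image, the kind of input a densification network would then complete. The pinhole projection and per-pixel averaging are assumptions; the paper's reflectance calibration and dynamic-scene handling are not reproduced:

```python
import numpy as np

def splat_reflectance(points, refl, K, size):
    """Project LiDAR returns `points` (N, 3) with per-point reflectance
    `refl` (N,) through a pinhole intrinsic matrix K (3, 3) into an image
    plane of shape `size`, averaging reflectance per pixel. Accumulating
    successive non-repeating sweeps gradually fills the image."""
    h, w = size
    img = np.zeros((h, w), dtype=np.float32)
    cnt = np.zeros((h, w), dtype=np.int32)
    z = points @ K[2]                        # projective depth per point
    valid = z > 1e-6                         # keep points in front of the camera
    uv = (points[valid] @ K[:2].T) / z[valid, None]
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    keep = (0 <= u) & (u < w) & (0 <= v) & (v < h)
    np.add.at(img, (v[keep], u[keep]), refl[valid][keep])
    np.add.at(cnt, (v[keep], u[keep]), 1)
    img[cnt > 0] /= cnt[cnt > 0]             # mean reflectance per hit pixel
    return img, cnt > 0                      # sparse reflectance + validity mask
```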
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- Asia > Singapore (0.04)
- Asia > Macao (0.04)
- (2 more...)
CHADET: Cross-Hierarchical-Attention for Depth-Completion Using Unsupervised Lightweight Transformer
Marsim, Kevin Christiansen, Jeon, Jinwoo, Kim, Yeeun, Jeong, Myeongwoo, Myung, Hyun
Depth information, which specifies the distance between objects and the robot's current position, is essential for many robot tasks such as navigation. Recently, researchers have proposed depth completion frameworks that provide dense depth maps offering comprehensive information about the surrounding environment. However, existing methods show significant trade-offs between computational efficiency and accuracy during inference; their substantial memory and computational requirements make them unsuitable for real-time applications, highlighting the need to improve the completeness and accuracy of depth information while increasing processing speed. To address these challenges, we propose CHADET (cross-hierarchical-attention depth-completion transformer), a lightweight depth-completion network that generates accurate dense depth maps from RGB images and sparse depth points. Features are extracted from each input pair by depthwise blocks and passed to an equally lightweight transformer-based decoder. In the decoder, we utilize a novel cross-hierarchical-attention module that refines the image features using depth information. Our approach improves the quality of the depth map prediction while reducing memory usage, as validated on the KITTI, NYUv2, and VOID datasets.
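A toy sketch of cross-attention in which image features are refined by depth features, in the spirit of CHADET's decoder module (the real module's hierarchy and dimensions are not given in the abstract, so this single-scale design is illustrative):

```python
import torch
import torch.nn as nn

class CrossHierarchicalAttention(nn.Module):
    """Image features act as queries; depth features supply keys and values,
    so the image stream is refined by wherever depth evidence attends."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, img_feat, depth_feat):
        # img_feat, depth_feat: (B, C, H, W) -> token sequences (B, H*W, C)
        b, c, h, w = img_feat.shape
        q = img_feat.flatten(2).transpose(1, 2)
        kv = depth_feat.flatten(2).transpose(1, 2)
        out, _ = self.attn(self.norm(q), kv, kv)   # image queries, depth keys/values
        return (q + out).transpose(1, 2).reshape(b, c, h, w)  # residual refinement
```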